Summary of project

The  scientific  objective  was  to  extend  MaxCal  [1]  to  include  microscopic  (data)  information  by  deriving  the 
explicit,  general,  theoretical  distributions  that  include  both  non-equilibrium,  path  trajectory  information  (as 
done  in  MaxCal)  and  direct,  observational  microscopic  information. 

Toward this end, the first task was to reproduce MaxCal and understand the logic and motivation behind
it. MaxCal is just the Principle of Maximum Entropy (MaxEnt [2]) where constraints are changing in time.
This simply amounts to an additional step to summarize the microstate (from MaxEnt) as it changes. The
change does not even need to be temporal, although MaxCal was originally conceived with constraints
changing in time. The key point is that the main algorithm is unchanged. In other words, it is still
"MaxEnt" and, as Jaynes himself said, "...it applies equally well to any physical quantity whatsoever." [3]. I
would add that it applies equally well to any information in the form of an expectation value (macroscopic
information) whatsoever. It is simply a different application of the MaxEnt algorithm. Jaynes renamed it
"MaxCal" simply to highlight the emphasis on the flux as opposed to the state. Therefore, I assert that
MaxCal is a special case of MaxEnt, and not the other way around.

As stated in the previous paragraph, the MaxEnt algorithm, and therefore MaxCal, is only applicable
to information in the form of expectation values. Microscopic information does not have a place in it.
However, macroscopic information is in essence a summary of microscopic information. Indeed, the probability
distributions of both Gibbs and Boltzmann describe the 'microstates' of the system. Therefore, it makes
logical sense that microscopic information would shape these distributions. Maximum relative Entropy (MrE)
[4] was shown to be a generalized algorithm that includes both MaxEnt and Bayes' Rule (which handles
microscopic information such as data) as special cases.

To  address  the  main  objective  of  the  project,  the  next  task  was  to  show  that  MaxCal  is  a  special  case  of 
MrE.  This  step  is  trivial  once  one  understands  the  preceding  comments;  MaxCal  is  a  special  case  of  MaxEnt 
and  MaxEnt  is  a  special  case  of  MrE.  Therefore,  MaxCal  is  a  special  case  of  MrE.  However,  although  this 
was  the  spirit  of  the  project,  the  specific  objective  was  to  determine  the  general  theoretical  distributions  that 
include  both  non-equilibrium  path  information  and  observational  information.  This  is  simply  an  application 
and  I  include  several  examples  to  illustrate  this  application. 

Finally, since someone will undoubtedly claim that a version of MrE that includes time-based constraints is
"new" in the sense that MaxCal is "new", I will name this special case of MrE, MrE(t).


Detailed  progress  and  results 

Results of this program can be broken into four areas:

•  Examine  the  MaxCal  algorithm  and  motivation  for  it 

•  Derive  general  MrE  application  example  of  non-equilibrium,  path  trajectory  information,  MrE(t) 

•  Illustrate  MrE(t)  with  several  explicit  examples 

•  Current  and  future  directions 


MaxCal  algorithm 

It  needs  to  be  stated  over  and  over  until  it  is  well  understood  that  MaxEnt  is  a  method  of  inference.  It  applies 
equally  well  to  any  information  in  the  form  of  an  expectation  value  (macroscopic  information)  whatsoever. 
Therefore,  there  is  no  notion  of  equilibrium  or  ergodicity  built  into  it. 




Background 


Boltzmann assumed many things when creating his "entropy" or, better, his distribution of the microstates that
a single particle can be in. For example, he needed to create the famous ergodic hypothesis so that he could
justify each microstate being equally likely [5]. This allowed him to use a multiplicity to describe the number
of particles in a particular macrostate. He also assumed that the particles did not interact. Although he
did not state it explicitly, by action he also assumed that the only information he had was the total energy of
the system. Using these pieces of information and assumptions in the MaxEnt algorithm, one obtains the
microcanonical distribution of microstates, $f_i$. It is valid only at equilibrium because the assumptions used
to create it are only valid at equilibrium.


$$f_i = \frac{1}{Z} e^{-\beta \varepsilon_i} , \tag{1}$$

where $Z$ is the partition function, $\varepsilon_i$ is an energy state and $\beta$ is the Lagrange multiplier that turns out
to be inversely proportional to the temperature. Following this yields his entropy,

$$S_B = -N \sum_i f_i \log f_i , \tag{2}$$

where $N$ is the number of particles. For the continuous case in momentum space,

$$S_B = -N \int d^3x \, d^3p \, f(x,p) \log f(x,p) . \tag{3}$$
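
As a numerical illustration of the single-particle distribution described above, the following is a minimal sketch that evaluates weights proportional to $e^{-\beta \varepsilon_i}$, normalizes them with the partition function, and sums the discrete form of the entropy. The energy levels, the value of $\beta$, the particle number and the function names are all hypothetical choices made for illustration (natural logarithm, $k_B = 1$).

```python
import math

def boltzmann_distribution(energies, beta):
    """Return f_i = exp(-beta * e_i) / Z for a list of energy levels."""
    weights = [math.exp(-beta * e) for e in energies]
    z = sum(weights)  # partition function Z
    return [w / z for w in weights]

def boltzmann_entropy(f, n_particles):
    """Return S_B = -N * sum_i f_i log f_i (natural log, k_B = 1)."""
    return -n_particles * sum(p * math.log(p) for p in f if p > 0.0)

energies = [0.0, 1.0, 2.0]  # hypothetical energy levels, arbitrary units
f = boltzmann_distribution(energies, beta=1.0)
s_b = boltzmann_entropy(f, n_particles=100)
```

Setting `beta=0.0` makes every level equally likely, which is the largest value the entropy can take for a fixed number of levels.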


Gibbs took a different approach [6]. He created a variational principle to determine the distribution of
the system at equilibrium, as was his intention. However, the variational principle itself has nothing to do
with an equilibrium state. It is this variational principle that is the true engine behind MaxEnt. Using
this, and not assuming collisionless particles, led to his version of the equilibrium microstate and to his
distribution,


$$f_i = \frac{1}{Z} e^{-\beta E_i} , \tag{4}$$

which is a state of the entire system of particles with $E_i$ as the energy function. This can clearly be seen in
the entropy form,

$$S[f] = -\int d^{3N}x \, d^{3N}p \, f(x,p) \log f(x,p) . \tag{5}$$


This is the basis of traditional statistical mechanics, which is defined at equilibrium, and it has been
experimentally verified many times over. However, the key insight into understanding this result is that it is
also valid for systems far from equilibrium. Just because Gibbs only used information that is valid
at equilibrium, such as the average energy, does not mean that the result is any less valid for systems far from
equilibrium. However, if it were used to predict the state of such a system, it would give terrible results.
The results might even be so bad that a random selection would do as well. In inference, we
attribute this to "noise" in the system, noise that overwhelms the predictive ability of the model.
Thus, while it may be a terrible model for a non-equilibrium system, that does not mean it is "wrong". It
simply does not contain enough information to describe the state to the necessary precision.


MaxEnt  and  MaxCal 

The MaxEnt algorithm can be defined in the following way. The correct entropic form is,

$$S[p] = -\sum_i p_i \log p_i , \tag{6}$$

where $p_i$ is the probability of microstate $i$, with normalization constraint,

$$\sum_i p_i = 1 , \tag{7}$$

and general macroscopic constraint,

$$\sum_i p_i f_k(x_i) = F_k , \tag{8}$$

where $f_k(x_i)$ is the $k$th function that is constraining the probability and $F_k$ is the value of the expectation
value constraining the posterior. Maximization of the entropy with constraints over the index "k" is,

$$\tilde{S}[p] = S[p] - \alpha \left( \sum_i p_i - 1 \right) - \sum_k \lambda_k \left( \sum_i p_i f_k(x_i) - F_k \right) , \tag{9}$$

which yields,

$$p_i = \frac{1}{Z} e^{-\sum_k \lambda_k f_k(x_i)} , \tag{10}$$

where $Z$ is the typical partition function.
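
To make the algorithm concrete, here is a minimal sketch for a single constraint: a six-sided die whose average face value is required to be 4.5. The die, the target mean, the bisection bounds and the function name are assumptions chosen for illustration. The multiplier is found by bisection, which works because the constrained average decreases monotonically as the multiplier grows.

```python
import math

def maxent_distribution(values, target_mean, lo=-50.0, hi=50.0, tol=1e-12):
    """MaxEnt p_i proportional to exp(-lam * f_i) with sum_i p_i f_i = target_mean."""
    def dist(lam):
        # Shift exponents by their maximum for numerical stability.
        expo = [-lam * v for v in values]
        top = max(expo)
        w = [math.exp(e - top) for e in expo]
        z = sum(w)  # partition function (up to the common shift)
        return [x / z for x in w]

    def mean(lam):
        return sum(p * v for p, v in zip(dist(lam), values))

    # mean(lam) decreases monotonically in lam, so bisection applies.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mean(mid) > target_mean:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    return lam, dist(lam)

faces = [1, 2, 3, 4, 5, 6]
lam, p = maxent_distribution(faces, target_mean=4.5)
```

Because the target mean exceeds the unconstrained mean of 3.5, the multiplier comes out negative and the probabilities increase with the face value. Nothing in the procedure refers to equilibrium; only the expectation value enters.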

The  MaxCal  algorithm  can  be  defined  in  the  following  way  from  above.  Let  k  be  a  temporal  index.  We 
can  rewrite  the  constraint  (8)  as, 


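$$\sum_i p_i f_k(x_i) = F_k . \tag{11}$$

Maximization of the entropy is,

$$\tilde{S}[p] = S[p] - \alpha \left( \sum_i p_i - 1 \right) - \sum_k \lambda(k) \left( \sum_i p_i f_k(x_i) - F_k \right) , \tag{12}$$

which yields,

$$p_i = \frac{1}{Z} e^{-\sum_k \lambda(k) f_k(x_i)} , \tag{13}$$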


where the Lagrange multiplier $\lambda(k)$ enforces each discrete "time" step and now $p_i$ is the probability of the
microtrajectory with index "i". A common objection is that if one time step is at "equilibrium", how can it
have another time step after? The answer follows from the argument presented above; it was never stated
that the system was ever at equilibrium. The MaxEnt algorithm is silent on this issue. Is this function
valid for systems far from equilibrium? It depends on the precision needed. For greater precision, more
information may be needed, i.e. more constraints. Note that if the relative entropy were used, the result
would be,

$$p_i = \frac{\varphi_i}{Z} e^{-\sum_k \lambda(k) f_k(x_i)} , \tag{14}$$

where $\varphi$ is the "prior".
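
The trajectory version can be sketched in the same way. The example below is a minimal illustration under stated assumptions: microtrajectories are binary sequences of three time steps, the constrained function at step k is simply the state at that step, and the target averages, the optional prior weights, the step size and the iteration count are all hypothetical. The multipliers are found by a plain gradient iteration on the dual problem rather than by any method taken from the MaxCal literature.

```python
import itertools
import math

def maxcal_paths(n_steps, targets, prior=None, lr=1.0, n_iter=5000):
    """Path probabilities p_i proportional to phi_i * exp(-sum_k lam[k] * f_k(x_i)).

    Paths are binary tuples; f_k(x_i) is the state of path i at step k,
    and targets[k] is the required average of that state at step k.
    """
    paths = list(itertools.product((0, 1), repeat=n_steps))
    phi = prior if prior is not None else [1.0] * len(paths)
    lam = [0.0] * n_steps

    def distribution(multipliers):
        w = [q * math.exp(-sum(l * s for l, s in zip(multipliers, path)))
             for q, path in zip(phi, paths)]
        z = sum(w)
        return [x / z for x in w]

    for _ in range(n_iter):
        p = distribution(lam)
        # Current averages <f_k>; raising lam[k] lowers <f_k>.
        avg = [sum(pi * path[k] for pi, path in zip(p, paths))
               for k in range(n_steps)]
        lam = [l + lr * (a - t) for l, a, t in zip(lam, avg, targets)]
    return paths, distribution(lam), lam

targets = [0.2, 0.5, 0.8]  # hypothetical time-dependent averages
paths, p, lam = maxcal_paths(3, targets)
```

With a uniform prior the time steps decouple and each multiplier equals log((1 - F) / F) for its own target F. Passing a non-uniform `prior` (one weight per path, in the order of `itertools.product`) couples the steps while the same constraints are still met.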


MrE(t) 

Here the general MrE application example of non-equilibrium, path trajectory information, MrE(t), is
derived.


MrE 

The Maximum relative Entropy (MrE) [4] method is designed to
update from a prior to a posterior distribution on the basis of three pieces of information: prior information
about $x$ (the prior), the known relationship between $x$ and $\theta$ (the model), and the observed values of the data
$\theta \in \Theta$. Since we are concerned with both $x$ and $\theta$, the relevant space is neither $X$ nor $\Theta$ but the product
$X \times \Theta$, and our attention must be focused on the joint distribution $P(x,\theta)$. The selected joint posterior
$P_{new}(x,\theta)$ is that which maximizes the entropy¹,

$$S[P, P_{old}] = -\int P(x,\theta) \log \frac{P(x,\theta)}{P_{old}(x,\theta)} \, dx \, d\theta , \tag{15}$$

subject to the appropriate constraints (parameters can be discrete as well). $P_{old}(x,\theta)$ contains our prior
information, which we call the joint prior. To be explicit,

$$P_{old}(x,\theta) = P_{old}(x) \, P_{old}(\theta|x) , \tag{16}$$

where $P_{old}(x)$ is the traditional Bayesian prior and $P_{old}(\theta|x)$ is the likelihood. It is important to note that
they both contain prior information. The Bayesian prior is defined as containing prior information. However,
the likelihood is not traditionally thought of in terms of prior information. Of course it is reasonable to see it
as such because the likelihood represents the model (the relationship between $\theta$ and $x$) that has already been
established. Thus we consider both pieces, the Bayesian prior and the likelihood, to be prior information. It
should be noted that Shore and Johnson [7] never make this connection.

¹ In the MrE terminology, we "maximize" the negative relative entropy $S$, so that $S \le 0$. This is the same as minimizing the
relative entropy.

The new information is the observed data, $\theta'$, which in the MrE framework must be expressed in the form
of a constraint on the allowed posteriors. The family of posteriors that reflects the fact that $\theta$ is now known
to be $\theta'$ is such that

$$P(\theta) = \int P(x,\theta) \, dx = \delta(\theta - \theta') , \tag{17}$$

where $\delta(\theta - \theta')$ is the Dirac delta function (or a Kronecker delta for the discrete case). This amounts to an
infinite number of constraints: there is one constraint on $P(x,\theta)$ for each value of the variable $\theta$ and each
constraint will require its own Lagrange multiplier $\lambda(\theta)$. Furthermore, we impose the usual normalization
constraint,

$$\int P(x,\theta) \, dx \, d\theta = 1 , \tag{18}$$

and include additional information about $x$ in the form of a constraint on the expected value of some function
$f(x)$,

$$\int P(x,\theta) f(x) \, dx \, d\theta = \langle f(x) \rangle = F . \tag{19}$$

The final step is to marginalize the posterior, $P_{new}(x,\theta)$, over $\theta$ to get our updated probability,

$$P_{new}(x) = P_{old}(x,\theta') \frac{e^{\beta f(x)}}{\zeta(\theta',\beta)} , \tag{20}$$

where $\zeta(\theta',\beta)$ is the partition function and $\beta$ is the Lagrange multiplier.

MrE(t) 

In the above example, no mention of equilibrium was ever made. Now a temporal function is explicitly
employed in the constraint,

$$\int P(x(t),\theta(t)) \left( \int f(x(t),t) \, dt \right) dx(t) \, d\theta(t) = \langle f(x(t)) \rangle = F , \tag{21}$$

and therefore the final distribution would be,

$$P_{new}(x(t)) = P_{old}(x(t),\theta(t)') \frac{e^{\int \beta(t) f(x(t),t) \, dt}}{\zeta(\theta(t)',\beta(t))} , \tag{22}$$

where $\beta(t)$ is a continuum of Lagrange multipliers, one for each time constraint. To provide a direct
comparison to the discrete MaxCal above, we have,

$$p_i = \frac{q_{ij'}}{Z} e^{-\sum_k \lambda(k) f_k(x_i)} , \tag{23}$$

where the index "j" is for the observable constraint, which in this case is not a Dirac delta function but a
Kronecker delta function, $\delta_{jj'}$, $q_{ij'}$ is the discrete Bayesian prior and the partition function, $Z$, is a function
of the observable index "j" as well.
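
The discrete update can be sketched for a single moment constraint as follows. This is a minimal illustration, not the general algorithm: the joint prior table, the observed outcome index, the constrained function and the target value are hypothetical, and the single multiplier is found by bisection. Fixing the observed outcome selects one column of the joint prior, and the moment constraint then reweights that column exponentially.

```python
import math

def mre_update(joint_prior, j_obs, f, target, lo=-50.0, hi=50.0, tol=1e-12):
    """Return lam and p_i proportional to q[i][j_obs] * exp(-lam * f[i]) with <f> = target."""
    column = [row[j_obs] for row in joint_prior]  # joint prior with the data fixed

    def dist(lam):
        expo = [-lam * fi for fi in f]
        top = max(expo)  # shift exponents for numerical stability
        w = [q * math.exp(e - top) for q, e in zip(column, expo)]
        z = sum(w)
        return [x / z for x in w]

    def mean(lam):
        return sum(p * fi for p, fi in zip(dist(lam), f))

    # <f> decreases monotonically in lam, so bisection applies.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mean(mid) > target:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    return lam, dist(lam)

# Hypothetical joint prior q[i][j] over 3 microstates i and 2 outcomes j.
q = [[0.10, 0.20],
     [0.15, 0.15],
     [0.30, 0.10]]
f = [0.0, 1.0, 2.0]
lam, p = mre_update(q, j_obs=1, f=f, target=1.2)
```

If the target is set equal to the mean of the ordinary Bayesian posterior for the observed outcome, the multiplier vanishes and the update reduces to Bayes' rule, which is the sense in which the data and the moment constraint are processed together.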

Examples 

Here  I  provide  a  few  examples  to  illustrate  MrE(t). 
General particle in motion

Let a particle's path in time be defined by a continuous function, $f(x,\dot{x},t)$. The constraint that would be
employed in MaxCal is,

$$\int P(x(t)) \left( \int f(x,\dot{x},t) \, dt \right) dx(t) = \langle f(x,\dot{x},t) \rangle \tag{24}$$

with distribution,

$$P(x(t)) = \frac{e^{-\int \lambda(t) f(x,\dot{x},t) \, dt}}{Z(\lambda(t))} \tag{25}$$

where $\lambda(t) f(x,\dot{x},t)$ is the classical Lagrangian and the integration produces the action. In contrast, MrE(t)
would be,

$$P_{new}(x(t)) = P_{old}(x(t),\theta(t)') \frac{e^{-\int \beta(t) f(x,\dot{x},t) \, dt}}{\zeta(\theta(t)',\beta(t))} \tag{26}$$

where $\beta(t)$ is the Lagrange multiplier and $\zeta(\theta(t)',\beta(t))$ is the partition function.


Specific particle in motion example (1D)

Let's assume that the motion is very simple: the particle has constant velocity, $v$, no potential
energy acting on it, and is moving in one dimension. The path over the time interval $[0, t]$ would then be

$$x(t) = x(0) + vt \tag{27}$$

and so the distribution of the paths with MaxCal would be,

$$P(x(t)) = \frac{e^{-x(0)-vt}}{Z(\lambda(t))} \tag{28}$$

and the MrE(t) solution would be,

$$P_{new}(x(t)) = P_{old}(x(t),\theta(t)') \frac{e^{-x(0)-vt}}{\zeta(\theta(t)',\beta(t))} \tag{29}$$

where the Bayesian prior, $P_{old}(x(t),\theta(t)')$, is some function that relates the position with some other
observable, perhaps a magnetic field. However, while the microscopic observables certainly change the shape
of the distribution, the mean of the position is assumed to be known in order to determine the Lagrange
multipliers. Therefore, the microscopic observables do not have any influence on the mean, $\langle x(t) \rangle$.


Specific  particle  in  motion  example  (2D_1) 

Let's assume that the motion is very simple: the particle has constant velocity, $v$, no potential
energy acting on it, and is moving in two dimensions. The path over the time interval $[0, t]$ would then be

$$x(t) = x(0) + v_x t \tag{30}$$

$$y(t) = y(0) + v_y t \tag{31}$$

and so the distribution of the paths with MaxCal would be,

$$P(x(t),y(t)) = \frac{e^{-x(0) - v_x t - y(0) - v_y t}}{Z(\lambda_x(t),\lambda_y(t))} \tag{32}$$

and the MrE(t) solution would be,

$$P_{new}(x(t),y(t)) = P_{old}(x(t),y(t),\theta_x(t)',\theta_y(t)') \frac{e^{-x(0) - v_x t - y(0) - v_y t}}{\zeta(\theta_x(t)',\theta_y(t)',\beta(t))} \tag{33}$$

where the Bayesian prior, $P_{old}(x(t),y(t),\theta_x(t)',\theta_y(t)')$, is some function that relates the position with some other
observable.
Specific particle in motion example (2D_2)

Now let's assume that we do not know the means ahead of time, which is a much more plausible scenario.
This is especially true if the position is not well observed in time, i.e. the sample mean, $\bar{x}(t)$, would be a very
poor estimate of the mean $\langle x(t) \rangle$. However, we may know some other important information regarding the
velocities, such as that the momentum is conserved. This means that on average over a long time, the velocity
components will be related in some way,

$$\langle \dot{x}(t) \rangle = g_0 \langle \dot{y}(t) \rangle \tag{34}$$

where $g_0$ is the ratio of the average of the initial velocity components. In this case the constraint would be,

$$\int P(x(t),y(t)) \left( \int \left[ f(x,\dot{x},t) - g_0 f(y,\dot{y},t) \right] dt \right) dx(t) \, dy(t) = \langle f(x,\dot{x},t) \rangle - g_0 \langle f(y,\dot{y},t) \rangle \tag{35}$$

Now the prior, and thus the observables, will influence what the mean of the system can be.


Flux  of  states 


Here I examine Ehrenfest's famous "dog-flea" model. In this discrete case, fleas either jump off a dog or
stay on. This can be seen as a flux of fleas. When this is combined with other dogs that are exchanging fleas,
it turns into a simplified version of many real-world physical systems. However, for our example, I will limit
myself at first to the flux of one dog. Further, I will describe the system as a set of coins, where each
flea is a coin in a state of 1 for jumping off or 0 for staying on. Let the index $i$ denote a microstate
of a system of $N$ fleas (coins). For example, if there were 4 fleas on the dog, or 4 coins, there would be
$2^4 = 16$ microstates. If we knew the average number of jumps or heads (changes), $\langle m \rangle$, the entropy to be
maximized would be,

$$S[p] = -\sum_i p_i \log p_i - \alpha \sum_i p_i - \lambda \sum_i p_i m_i , \tag{36}$$

and after some manipulations, the distribution for the microstates would be converted to the distribution
of the jumps, where the binomial distribution would be produced,

$$p(m) = \frac{N!}{m! \, (N-m)!} \, p^m (1-p)^{N-m} . \tag{37}$$
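
The step from microstates to the distribution of jumps can be checked by direct enumeration. The sketch below is a minimal illustration with hypothetical values (4 coins, an average of 1 jump): it assigns each of the $2^N$ microstates a weight $e^{-\lambda m_i}$, sums the probabilities of all microstates with the same number of jumps, and compares the result with the binomial form. The closed-form multiplier used here follows from requiring the per-coin jump probability to equal $\langle m \rangle / N$.

```python
import itertools
import math

def jump_distribution(n_coins, mean_jumps):
    """MaxEnt over 2**n microstates with <m> fixed, summarized as p(m)."""
    # For this constraint the multiplier has a closed form:
    # exp(-lam) = q / (1 - q) with q = <m> / N.
    q = mean_jumps / n_coins
    lam = math.log((1.0 - q) / q)
    states = list(itertools.product((0, 1), repeat=n_coins))
    weights = [math.exp(-lam * sum(s)) for s in states]
    z = sum(weights)  # partition function over all microstates
    p_m = [0.0] * (n_coins + 1)
    for s, w in zip(states, weights):
        p_m[sum(s)] += w / z  # add microstate probability to its jump count
    return q, p_m

q, p_m = jump_distribution(n_coins=4, mean_jumps=1.0)
binomial = [math.comb(4, m) * q**m * (1 - q)**(4 - m) for m in range(5)]
```

The lists `p_m` and `binomial` agree to floating-point precision, so the multiplicity factor in the binomial arises purely from counting microstates with equal weight.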

We can extend this to more than a two-state system; for instance, instead of a coin, we have a 3-sided die. Or
the states are described by $-1, 0, 1$, which might represent two other dogs that the flea can jump to. With 4
fleas we would now have $3^4 = 81$ microstates. In that case the result would be,

$$p(m_1,m_2) = \frac{N!}{m_1! \, m_2! \, (N-m_1-m_2)!} \, p_1^{m_1} p_2^{m_2} (1-p_1-p_2)^{N-m_1-m_2} . \tag{38}$$

This once again assumes we know both the means of the two dimensions of this 2D-simplex ($m_3 = N - m_1 -
m_2$). If this is not the case, and we employ a similar constraint as in the practical example, we would have,

$$S[p] = -\sum_i p_i \log p_i - \alpha \sum_i p_i - \lambda \sum_i p_i \left( m_{1i} - g \, m_{2i} \right) ,$$

with some prior that relates some observables to the $m$'s. For a more detailed example of a more complicated
system, see [8]. These examples can immediately produce new views of Fick's law of diffusion, Fourier's
law of heat flow, the Newtonian viscosity law, and the mass-action laws of chemical kinetics.
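
The ratio constraint on the three-state flea system can also be sketched numerically. The following is a minimal illustration under several assumptions that are not in the text: the constraint value is taken to be zero, so that the average of $m_1$ equals $g$ times the average of $m_2$; the three states are labelled 0, 1, 2 rather than $-1, 0, 1$; and the prior is a product of hypothetical per-flea weights. The multiplier is found by bisection.

```python
import itertools
import math

def ratio_constrained(n_fleas, g, site_prior, lo=-20.0, hi=20.0, tol=1e-12):
    """Return (lam, p, <m1>, <m2>) with p_i proportional to prior_i * exp(-lam * h_i),
    h_i = m1_i - g * m2_i, and lam chosen so that <m1> = g * <m2>."""
    states = list(itertools.product((0, 1, 2), repeat=n_fleas))
    m1 = [s.count(1) for s in states]  # fleas that jumped to dog 1
    m2 = [s.count(2) for s in states]  # fleas that jumped to dog 2
    prior = [math.prod(site_prior[x] for x in s) for s in states]
    h = [a - g * b for a, b in zip(m1, m2)]  # constrained function

    def dist(lam):
        expo = [-lam * v for v in h]
        top = max(expo)  # shift exponents for numerical stability
        w = [q * math.exp(e - top) for q, e in zip(prior, expo)]
        z = sum(w)
        return [x / z for x in w]

    def mean_h(lam):
        return sum(p * v for p, v in zip(dist(lam), h))

    # <h> decreases monotonically in lam, so bisection finds <h> = 0.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mean_h(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    p = dist(lam)
    avg1 = sum(pi * a for pi, a in zip(p, m1))
    avg2 = sum(pi * b for pi, b in zip(p, m2))
    return lam, p, avg1, avg2

flat = ratio_constrained(4, g=2.0, site_prior=(1 / 3, 1 / 3, 1 / 3))
skewed = ratio_constrained(4, g=2.0, site_prior=(0.6, 0.2, 0.2))
```

Both runs satisfy the same ratio constraint, yet the individual averages of $m_1$ and $m_2$ differ between the flat and the skewed prior. In this toy setting the constraint fixes only the ratio, and the prior determines where the means actually land.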


Final  thoughts  and  future  work 

In  [9]  we  adopted  a  consistency  axiom  similar  to  that  proposed  by  Shore  and  Johnson  [7].  When  two  systems 
are  independent  it  should  not  matter  whether  the  inference  procedure  treats  them  separately  or  jointly.  The 
merit  of  such  a  consistency  axiom  is  that  it  is  very  compelling:  it  is  difficult  to  advocate  any  other  alternative. 
Nevertheless, this axiom has been criticized by Karbelkar [10] and by Uffink [11]. In their view it fails to single
out the usual logarithmic entropy as the unique tool for updating. It merely restricts the form of the entropy
to a one-dimensional continuum labeled by a parameter $\eta$. The resulting $\eta$-entropies are equivalent to those
proposed by Renyi [12] or by Tsallis [13] in the sense that they lead to the same updated probabilities.

The main result of [9] was to go beyond the insights of Karbelkar and Uffink, and show that the
consistency axiom selects a unique, universal value for the parameter $\eta$ and this value corresponds to the
usual logarithmic entropy. The advantage of our approach is that it shows precisely how it is that the other
$\eta$-entropies are ruled out as tools for updating. This is particularly important as the two recent "hot"
articles cited as the state of the art above [14, 15] use Shore and Johnson as their foundation. We have
already addressed the criticism that will be applied to those articles.

The ideas of MaxEnt have evolved as well. MaxEnt was designed to assign rather than update
probabilities. However, if information is given in the form of data, then the proper method for inference was
Bayes' theorem. The method of Maximum relative Entropy (MrE) [4] is capable of reproducing every aspect
of  orthodox  Bayesian  inference  and  proves  the  complete  compatibility  of  Bayesian  and  Maximum  Entropy 
methods.  However,  it  also  opens  the  door  to  tackling  problems  that  could  not  be  previously  addressed 
by  either  the  MaxEnt  or  orthodox  Bayesian  methods  individually,  such  as  inferring  parameters  from  both 
constraints  and  data  simultaneously.  This  fundamentally  changes  the  inference  landscape  as  now  there 
is  no  separation  of  the  microscopic  (measurements,  observations,  data)  and  the  macroscopic  (constraints, 
moments,  Hamiltonians,  averages,  etc.). 

Besides extending the many examples above, there are many uses of MaxCal that can be exploited
with the generalized algorithm, MrE(t). Fluid and gas dynamics are two applications that would be very
exciting to pursue, as would biological applications such as protein folding. All of these have direct relevance
to Army objectives in complex, nonlinear, dynamical systems, fluid dynamics, self-organizing systems and
large scale, autonomous, dynamical networks.